White wine is a wine whose colour can be straw-yellow, yellow-green, or yellow-gold. It is produced by the alcoholic fermentation of the non-coloured pulp of grapes, which may have a skin of any colour. Common tests include °Brix, pH, titratable acidity, residual sugar, free or available sulfur, total sulfur, volatile acidity and percent alcohol.
## 'data.frame': 4898 obs. of 13 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ fixed.acidity : num 7 6.3 8.1 7.2 7.2 8.1 6.2 7 6.3 8.1 ...
## $ volatile.acidity : num 0.27 0.3 0.28 0.23 0.23 0.28 0.32 0.27 0.3 0.22 ...
## $ citric.acid : num 0.36 0.34 0.4 0.32 0.32 0.4 0.16 0.36 0.34 0.43 ...
## $ residual.sugar : num 20.7 1.6 6.9 8.5 8.5 6.9 7 20.7 1.6 1.5 ...
## $ chlorides : num 0.045 0.049 0.05 0.058 0.058 0.05 0.045 0.045 0.049 0.044 ...
## $ free.sulfur.dioxide : num 45 14 30 47 47 30 30 45 14 28 ...
## $ total.sulfur.dioxide: num 170 132 97 186 186 97 136 170 132 129 ...
## $ density : num 1.001 0.994 0.995 0.996 0.996 ...
## $ pH : num 3 3.3 3.26 3.19 3.19 3.26 3.18 3 3.3 3.22 ...
## $ sulphates : num 0.45 0.49 0.44 0.4 0.4 0.44 0.47 0.45 0.49 0.45 ...
## $ alcohol : num 8.8 9.5 10.1 9.9 9.9 10.1 9.6 8.8 9.5 11 ...
## $ quality : int 6 6 6 6 6 6 6 6 6 6 ...
##        X        fixed.acidity    volatile.acidity  citric.acid
##  Min.   :   1   Min.   : 3.800   Min.   :0.0800   Min.   :0.0000
##  1st Qu.:1225   1st Qu.: 6.300   1st Qu.:0.2100   1st Qu.:0.2700
##  Median :2450   Median : 6.800   Median :0.2600   Median :0.3200
##  Mean   :2450   Mean   : 6.855   Mean   :0.2782   Mean   :0.3342
##  3rd Qu.:3674   3rd Qu.: 7.300   3rd Qu.:0.3200   3rd Qu.:0.3900
##  Max.   :4898   Max.   :14.200   Max.   :1.1000   Max.   :1.6600
##  residual.sugar     chlorides       free.sulfur.dioxide
##  Min.   : 0.600   Min.   :0.00900   Min.   :  2.00
##  1st Qu.: 1.700   1st Qu.:0.03600   1st Qu.: 23.00
##  Median : 5.200   Median :0.04300   Median : 34.00
##  Mean   : 6.391   Mean   :0.04577   Mean   : 35.31
##  3rd Qu.: 9.900   3rd Qu.:0.05000   3rd Qu.: 46.00
##  Max.   :65.800   Max.   :0.34600   Max.   :289.00
##  total.sulfur.dioxide    density            pH          sulphates
##  Min.   :  9.0        Min.   :0.9871   Min.   :2.720   Min.   :0.2200
##  1st Qu.:108.0        1st Qu.:0.9917   1st Qu.:3.090   1st Qu.:0.4100
##  Median :134.0        Median :0.9937   Median :3.180   Median :0.4700
##  Mean   :138.4        Mean   :0.9940   Mean   :3.188   Mean   :0.4898
##  3rd Qu.:167.0        3rd Qu.:0.9961   3rd Qu.:3.280   3rd Qu.:0.5500
##  Max.   :440.0        Max.   :1.0390   Max.   :3.820   Max.   :1.0800
##     alcohol         quality
##  Min.   : 8.00   Min.   :3.000
##  1st Qu.: 9.50   1st Qu.:5.000
##  Median :10.40   Median :6.000
##  Mean   :10.51   Mean   :5.878
##  3rd Qu.:11.40   3rd Qu.:6.000
##  Max.   :14.20   Max.   :9.000
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   3.000   5.000   6.000   5.878   6.000   9.000
##
##     3    4    5    6    7    8    9
##    20  163 1457 2198  880  175    5
The above graph shows that most of the wines have a quality rating between 5 and 7. Only 5 wines received the highest rating (9) and only 20 received the lowest (3).
##
##    low medium   high
##    183   4535    180
The acids in wine are an important component in both winemaking and the finished product of wine. They are present in both grapes and wine, having direct influences on the color, balance and taste of the wine as well as the growth and vitality of yeast during fermentation and protecting the wine from bacteria.
The histograms below show the distribution of fixed acidity, volatile acidity and citric acid across all the wines.
The dataset reports three measures of acidity: fixed acidity, volatile acidity and citric acid. The three primary acids found in wine grapes are tartaric, malic and citric acid. Most of the acids involved with wine are fixed acids; the notable exception is acetic acid, mostly found in vinegar, which is volatile and can contribute to the wine fault known as volatile acidity. Acetic acid in wine, often referred to as volatile acidity (VA) or vinegar taint, can be contributed by many wine spoilage yeasts and bacteria.
From the graphs we can see that fixed acidity ranges from about 4 to 14 g/dm^3, with a peak near 7 g/dm^3, whereas volatile acidity ranges from about 0.1 to 1.1 g/dm^3, with a peak between 0.2 and 0.3 g/dm^3. Volatile acidity is low in most wines, which is consistent with an excess of volatile acid spoiling the wine.
Citric acid is found only in very small quantities in wine grapes. As an inexpensive supplement, it can be used by winemakers in acidification to boost the wine's total acidity, but it is used less frequently than tartaric and malic acid because of the aggressive citric flavors it can add to the wine. The graph shows citric acid ranging between 0 and 1.66 g/dm^3, with very few wines having more than 0.6 g/dm^3.
A wine with too much acidity will taste excessively sour and sharp. A wine with too little acidity will taste flabby and flat, with less defined flavors. Hence we can see that most of the wines fall in a moderate range of acidity.
The strength of acidity is measured by pH, with most wines having a pH between 2.9 and 3.9. Generally, the lower the pH, the higher the acidity in the wine. However, there is no direct connection between total acidity and pH: it is possible to find wines with both a high pH (for wine) and high acidity. Winemakers use pH as a way to measure ripeness in relation to acidity. Low pH wines will taste tart and crisp, while higher pH wines are more susceptible to bacterial growth. A pH of about 3.0 to 3.4 is desirable for white wines.
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   2.720   3.090   3.180   3.188   3.280   3.820
The graph and summary show that half of the wines have a pH between 3.09 and 3.28, with a median of 3.18 and a mean of about 3.19.
The subjective sweetness of a wine is determined by the interaction of several factors, including the amount of sugar in the wine, but also the relative levels of alcohol, acids, and tannins. Sugars and alcohol enhance a wine’s sweetness; acids (sourness) and bitter tannins counteract it.
White wine is made from white or black grapes, but always ones with white flesh (grapes with coloured flesh, and therefore coloured juice, are called Teinturier). Once harvested, the grapes are pressed and only the juice, called the must, is extracted. The must is put into tanks for fermentation, where yeast present on the grapes transforms the sugar into alcohol.
The dataset provides the percent alcohol content of each wine.
The histogram below shows the distribution of alcohol content across all the wines.
Medium Alcohol Wines - Wines ranging from 11.5%–13.5% ABV.
Medium-High Alcohol Wines - Wines ranging from 13.5%–15% ABV. This is the average range for dry American wines and for wines from other warm-climate growing regions, including Argentina, Australia, Spain and Southern Italy. Warmer climates produce sweeter grapes, which in turn increases the potential alcohol content of the wine.
High Alcohol Wines - Wines over 15% ABV.
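As a rough illustration of how these bands relate to this dataset, the sketch below classifies the five alcohol summary values reported earlier (minimum, quartiles and maximum) with base R's `cut()`. The band labels, the choice to treat each lower edge as inclusive, and the single catch-all band for everything under 11.5% are assumptions made for the example, not part of the original analysis.

```r
# ABV values taken from the alcohol summary earlier in the report:
# minimum, 1st quartile, median, 3rd quartile and maximum.
abv <- c(min = 8.0, q1 = 9.5, median = 10.4, q3 = 11.4, max = 14.2)

# Band edges follow the three ranges listed above. Treating each lower edge
# as inclusive (right = FALSE) and lumping everything under 11.5% into one
# band are assumptions made for this sketch.
band <- cut(abv,
            breaks = c(-Inf, 11.5, 13.5, 15, Inf),
            labels = c("below medium", "medium", "medium-high", "high"),
            right = FALSE)

print(data.frame(abv, band))
```

Because the third quartile (11.4%) sits just under the 11.5% edge, at least three quarters of the wines in the dataset fall below the medium band under these assumptions.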
The first thing to understand about sulfites is that they bind with other things in wine. They bind with micro-organisms, oxygen, solids, yeast, acids, bacteria, and sugars. When this chemical bond happens the sulfite goes from being free to bound. Bound sulfite has already done its job and while it is still in wine it is not free to bind with anything else. Thus we have two different sulfite levels to worry about, free and total.
Free Sulfur Dioxide: A wine needs to be protected against many things that can spoil it, and that protection comes only from free sulfites. They prevent microbial growth and the oxidation of the wine.
Total Sulfur Dioxide: the amount of free and bound forms of SO2. In low concentrations SO2 is mostly undetectable in wine, but at free SO2 concentrations over 50 ppm it becomes evident in the nose and taste of the wine.
Thus a winemaker needs to know how much free sulfite is already in the wine and how much free sulfite is wanted. When sulfites are added, usually in the form of potassium metabisulfite, some will become bound while the rest remain free. One can't predict how much will become bound, so winemakers add potassium metabisulfite, test the wine, then adjust as necessary.
The effectiveness of sulfites changes with the pH of the wine: the higher the pH, the more sulfites are needed to do the same job. The maximum allowable doses depend on the sugar content of the wine, because residual sugar is susceptible to attack by microorganisms, which would restart fermentation.
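The pH dependence can be made concrete with a small calculation. The sketch below assumes the commonly quoted relationship molecular SO2 = free SO2 / (1 + 10^(pH - pKa)); the pKa of 1.81 and the 0.8 mg/L molecular SO2 target are rules of thumb brought in for illustration, not values taken from this dataset or from the cited source. The function name `free_so2_needed` is hypothetical.

```r
# Free SO2 (mg/L) needed to reach a target molecular SO2 level at a given pH.
# ASSUMPTIONS: pKa = 1.81 for the SO2/bisulfite equilibrium and a 0.8 mg/L
# molecular SO2 target are commonly quoted rules of thumb, not values from
# this dataset.
free_so2_needed <- function(pH, target_molecular = 0.8, pKa = 1.81) {
  target_molecular * (1 + 10^(pH - pKa))
}

# pH values taken from the pH summary: 1st quartile, median, 3rd quartile.
pH <- c(3.09, 3.18, 3.28)
print(round(free_so2_needed(pH), 1))
```

Under these assumptions the required free SO2 grows with pH, which is the behaviour described above.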
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##  0.2200  0.4100  0.4700  0.4898  0.5500  1.0800
Chlorides: the amount of salt (sodium chloride) in the wine.
Density: the density of wine is close to that of water and depends on the percent alcohol and sugar content.
There are 4898 observations of wine with 12 variables, not counting the row index X (11 numeric physicochemical properties and one integer expert quality rating). Other observations:
Most of the wines have a quality rating of 5, 6 or 7.
Most of the wines have a pH between 2.80 and 3.47.
The median alcohol content is 10.40%.
The mean residual sugar is 6.391 g/dm^3, with a maximum of 65.80 g/dm^3.
I find all the variables important for analysing the dataset. As studied, every chemical property adds to a wine's quality. However, I would like to focus more on acidity, sugar, alcohol and sulphates.
My interest is in which chemical properties contribute to high quality wines, and which quantities of those properties are associated with low quality wines. I think the relationships between the properties could help define a wine's quality and taste.
I created a new variable that categorizes quality into 'low', 'medium' and 'high' levels. Wines rated 3 and 4 are low quality, wines rated 5, 6 and 7 are medium quality, and wines rated 8 and 9 are high quality.
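A minimal sketch of this categorization in base R is shown below. To stay self-contained it rebuilds the quality column from the rating counts reported earlier instead of reading the data file; the variable name `quality.class` is an assumption, and the original analysis may have built the factor differently.

```r
# Rebuild the quality column from the rating counts reported earlier.
counts  <- c(`3` = 20, `4` = 163, `5` = 1457, `6` = 2198,
             `7` = 880, `8` = 175, `9` = 5)
quality <- rep(as.integer(names(counts)), times = counts)

# Ratings 3-4 -> low, 5-7 -> medium, 8-9 -> high.
# cut() uses right-closed intervals by default: (2,4], (4,7], (7,9].
quality.class <- cut(quality,
                     breaks = c(2, 4, 7, 9),
                     labels = c("low", "medium", "high"))

print(table(quality.class))
```

The resulting table should reproduce the 183 low, 4535 medium and 180 high wines shown in the class counts.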
In this dataset, most chemical properties are roughly normally distributed. The exceptions are residual sugar, which has a positively skewed distribution, and alcohol, which has a multimodal distribution.
The data were already in tidy form, so no adjustments were needed.
Comparing the medians of low, medium and high quality wines, the fixed acidity of medium and high quality wines is slightly lower than that of low quality wines.
The volatile acidity of low quality wines is higher than that of medium and high quality wines.
There is no notable difference in the citric acid content of low, medium and high quality wines.
The boxplot shows that a larger share of high quality wines have a comparatively high pH.
The graph shows that high quality wines are a little sweeter than medium and low quality wines.
High quality wines have distinctly high alcohol levels, mostly above 11%.
The graph shows that wine quality does not depend on potassium sulphate content.
The above boxplot shows that free sulfur dioxide is much lower in low quality wines.
However, there is not much difference in the total sulfur dioxide content of wines of different quality.
The above graph shows that high quality wines have less sodium chloride. High quality wines mostly have no more than 0.04 g/dm^3 of sodium chloride, whereas low quality wines mostly have more than 0.038 g/dm^3.
Because outliers were distorting the plot, I limited the y axis in the density vs quality class graph. High quality wines mostly have a density of no more than 0.9937 g/cm^3, i.e. high quality wines are of comparatively lower density.
The data summary and graphs show that high quality wines have both high alcohol levels and high residual sugar. However, there is a negative correlation between alcohol and residual sugar. This is an interesting relationship to analyse.
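The correlation coefficient behind a statement like this can be computed with `cor()`. The sketch below uses only the first ten rows of the two columns, copied from the `str()` output near the top of the report, so that it runs without the data file. Ten rows are far too few to estimate the dataset-wide coefficient, so the printed value is illustrative only.

```r
# First ten rows of two columns, copied from the str() output.
# Ten rows are far too few to estimate the dataset-wide correlation; they
# only make the example self-contained.
alcohol        <- c(8.8, 9.5, 10.1, 9.9, 9.9, 10.1, 9.6, 8.8, 9.5, 11)
residual.sugar <- c(20.7, 1.6, 6.9, 8.5, 8.5, 6.9, 7, 20.7, 1.6, 1.5)

# Pearson correlation coefficient (the default method of cor()).
r <- cor(alcohol, residual.sugar)
print(r)

# cor.test() reports a confidence interval and p-value for the same coefficient.
print(cor.test(alcohol, residual.sugar))
```

On the full data frame the same call would take the two columns directly, for example `cor(wines$alcohol, wines$residual.sugar)`, where `wines` is a hypothetical name for the loaded dataset.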
Another interesting relationship I noticed is between sodium chloride and wine quality.
The strongest relationships found are between 1) alcohol and density, 2) residual sugar and density, and 3) alcohol and sodium chloride.
To analyze how the chemical properties and their relationships define quality levels, I subset the dataset to keep only the high and low quality wines.
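One way to take such a subset in base R is sketched below. The data frame is rebuilt from the rating counts reported earlier so the example runs on its own; the names `wines`, `quality.class` and `extremes` are assumptions, not necessarily those used in the original analysis.

```r
# Rebuild a one-column data frame from the rating counts reported earlier.
counts <- c(`3` = 20, `4` = 163, `5` = 1457, `6` = 2198,
            `7` = 880, `8` = 175, `9` = 5)
wines  <- data.frame(quality = rep(as.integer(names(counts)), times = counts))

# Ratings 3-4 -> low, 5-7 -> medium, 8-9 -> high.
wines$quality.class <- cut(wines$quality,
                           breaks = c(2, 4, 7, 9),
                           labels = c("low", "medium", "high"))

# Keep only the extreme classes and drop the now-empty "medium" level,
# so that plots and tables do not show an empty category.
extremes <- droplevels(subset(wines, quality.class != "medium"))

print(table(extremes$quality.class))
```

This leaves 363 wines (183 low and 180 high), which is what the scatter plots of low versus high quality wines are drawn from.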
The graph shows a negative correlation between alcohol and density. Low quality wines are high in density and low in alcohol, while high quality wines are high in alcohol and low in density.
The above graph doesn't show much of a relationship between alcohol and residual sugar. Low and high quality wines are spread almost equally across the range of residual sugar.
There is a positive correlation between density and residual sugar: as residual sugar increases, density increases. However, among wines with the same residual sugar, those with higher density tend to be low quality.
The above graph shows a negative correlation between the alcohol and chloride content of wines. High quality wines have high alcohol and comparatively little sodium chloride, whereas low quality wines have comparatively high sodium chloride and lower alcohol.
From the above graphs it can be noticed that:
1) There is a negative correlation between alcohol and density. Low quality wines are high in density and low in alcohol, whereas high quality wines are high in alcohol and low in density.
2) There is not much of a relationship between alcohol and residual sugar. Low and high quality wines are almost equally distributed in terms of residual sugar content.
3) There is a positive correlation between density and residual sugar: as residual sugar increases, density increases. However, among wines with the same residual sugar, those with comparatively higher density are of low quality and those with lower density are of high quality.
4) There is a negative correlation between the alcohol and chloride content of wines. High quality wines have high alcohol and comparatively little sodium chloride, whereas low quality wines have comparatively high sodium chloride and lower alcohol.
One interesting relationship I noticed is between the alcohol level and the sodium chloride content of wines. Wines with low sodium chloride have high alcohol levels and are better in quality.
In plot one I used boxplots to show the alcohol level, density and sodium chloride content of low, medium and high quality wines. From the graph we can see that high quality wines have comparatively high alcohol, low density and low sodium chloride content.
The above graph is a scatter plot of alcohol against density for low and high quality wines. Wines with a quality rating of 8 or 9 are high quality and are shown as blue dots; wines with a rating of 3 or 4 are low quality and are shown as peach dots. The graph shows a negative correlation between alcohol and density. Wines with high alcohol have low density and are mostly high quality, whereas wines with low alcohol have high density and are mostly low quality.
The above graph is a scatter plot of alcohol against sodium chloride for low and high quality wines. Wines with a quality rating of 8 or 9 are high quality and are shown as blue dots; wines with a rating of 3 or 4 are low quality and are shown as peach dots. The graph shows a negative correlation between alcohol and sodium chloride. High quality wines mostly have low sodium chloride content, and low quality wines have comparatively high chloride content.
From the above analysis, I found that the quality testers preferred wines with comparatively high alcohol levels. Although I did not initially expect sodium chloride to have any impact on quality, this analysis shows that wines with high sodium chloride did not taste good to the quality testers. Another interesting fact about the physicochemical properties emerged from exploring the correlation of residual sugar, density and alcohol: sweeter wine is denser, and among wines of the same sweetness, those with more alcohol have lower density.
Limitations - Because the quality ratings for all the wines come from only three testers, it would be unwise to select wines based on this analysis alone. However, the analysis gives a good idea of which physicochemical properties to look for when selecting a wine.
Reference - http://winemakersacademy.com/potassium-metabisulfite-additions/